Technique for Finding and Investigating the Strongest Combinations of Cyberattacks on Smart Grid Infrastructure

Smart grids have recently become a central direction of energy policy in many countries. Because of their structural and operational features, smart grids are a constant target of combined and simultaneous cyberattacks. To improve security and to optimize existing network schemes against cyber intrusion, this paper proposes a decision support approach for finding and identifying the most potent attack combinations, that is, those capable of inflicting maximum damage on the system. The main purpose is to identify, from the attacker's perspective, the most severe combinations of attacks on smart grid components that could potentially be implemented. In this context, we consider the problem of finding weak points in the network configuration of a smart grid and assessing the impact of events on its cyberinfrastructure. The technique for detecting and investigating the strongest combinations of cyberattacks on the smart grid network is illustrated by analyzing the spread of pandemic malware in a system with an arbitrary structure.


I. INTRODUCTION
Cyber risk is a distinctive problem for smart grid infrastructure, since a cyberattack can easily move from the cyber sphere to the physical world. When considering cyberattacks on distributed intelligent electrical power systems, the following points should be taken into account:
• Several types of attacks can be launched simultaneously in the cyber-physical system of the smart grid infrastructure.
• Cybercriminals choose among attacks according to the simplicity of the actions, the course of events, and the lowest complexity of creating an attack that maximizes harm.
• Given this, existing cyber threats should be considered both at the level of physical components and in terms of the related ICT components.
The most representative recent examples of realized cyber threats are the blackouts in Ukraine in 2015 and 2016, which occurred as a result of a series of cyberattacks, and the successful attacks by the hacker group Dragonfly 2.0, which in 2017 gained access to several network interfaces of energy companies that operators use to transmit commands to equipment such as circuit breakers. In 2017, the NotPetya ransomware spread at a transnational level. This intervention was recognised as an act of full-scale cyberwar. A few hours after its first appearance, the worm went beyond Ukraine and hit many machines around the world, "from a hospital in Pennsylvania to the Tasmanian chocolate factory" [1]. Attacks on Maersk (2017) and DisTack (2018) led to immense losses by disrupting services in company networks all around the world. In April 2018, as a result of a cyberattack, four U.S. pipeline companies experienced a shutdown of their electronic systems that lasted for several days.
The list of ICT components of smart energy networks that should be considered potential sources of vulnerabilities includes [2]:
• Operating systems and components: generators, transformers, supervisory control and data acquisition (SCADA) systems, energy management / distribution systems (EMS / DMS), programmable logic controllers (PLCs), substations, smart meters and other smart electrical devices.
• Endpoints: smart meters, electric vehicles (EVs), smartphones and other mobile devices.

II. MODELS OF MALWARE
Table II presents the types of cyber-physical attacks on smart grid components in terms of their impact on integrity, privacy, and accessibility. According to the classification given in [3], the characteristics and parameters of malware can be represented in terms of the pandemic, endemic, and infectious types (see Table III). In contrast to [3], where the characteristics of malware distribution in networks are studied by mathematical modeling, this study defines the components used to model the initial stages of malware distribution in such networks in order to analyze the accessibility of network branches to cyberattacks.

Malware examples listed in Table III: Nimda [4], Slammer [6,7], Conficker [8,9], Regin [10], Duqu [11,12], Flame [11], Gauss [11], Equation [13], AdWind [14,15], GreyEnergy [16].

It is worth noting that the taxonomy presented in Table III is conditional, since malware can combine characteristics of different types. For example, the Stuxnet virus [17] has many functions similar to the endemic category, but it does not use hit lists. The speed and scale of distribution of WannaCry [18] and NotPetya [19], which in 2017 infected more than 200,000 devices in more than 100 countries and caused more than $4 billion in losses in just 24 hours [20], make them pandemic, while they use more sophisticated implementation and anti-tampering methods, such as XOR encryption and fake Microsoft digital signatures [21]. The same applies to their predecessor, Petya, whose developers took methods commonly used by penetration testers and hackers and built sophisticated multi-threaded automation of these methods into a single piece of code.
The GreyEnergy malware currently has no destructive capabilities and seems to focus on espionage and intelligence operations on control system workstations running SCADA software and servers, which gives reason to place it in the third category. However, GreyEnergy has a modular architecture, which means that its capabilities can be expanded [22]. ESET experts also note that GreyEnergy has been involved in attacks on energy companies and other high-value targets in Ukraine and Poland over the last three years.
Identifying potential high-risk vulnerabilities is a vital component of an advanced security strategy and should be part of the overall smart grid infrastructure management program. Early warning systems, risk management analytics, security monitoring systems and digital audit systems enable businesses and researchers to make better decisions. Traditional decision support techniques rely heavily on data analytics algorithms for security, cyberattack capabilities and vectors, data breaches, and more. However, the regulatory analysis showed that, because of the technical difference between the IEs and the CVSS used by NIST, threat assessment could only be performed on IEC 61850 computer nodes such as database servers, engineering stations, human-machine interfaces and gateways. Therefore, it is necessary to develop and use a new metric scheme that can take different levels of threats into account and support security audits of smart energy systems.
In view of the above, this work assumes that several simultaneous attacks on similar smart grid system components can occur. The main purpose is to identify the strongest combinations of attacks on smart grid components that could potentially be implemented from the attacker's perspective.

III. BASIC ASSUMPTIONS
One of the most challenging issues caused by cyber intrusions is cascading network outages due to a simultaneous attack on several nodes of a distributed power system. In this case, the goal of a cyberattack is to disconnect network branches or switch them from an operational state to shutdown. Potential attacks on the network $c_i$ form the combination vector $C = \{c_1, c_2, c_3, c_4, \ldots, c_m\}$. The elements of this vector are sets of network branches that are disconnected simultaneously, and they make it possible to calculate the damage caused to the system by the corresponding combinations of cyberattacks. The times to turn off the power system, $t(c_i)$, form another vector: $T = \{t(c_1), t(c_2), t(c_3), t(c_4), \ldots, t(c_m)\}$.
Using this notation, a combinational attack is considered powerful if the maximum number of disconnected branches, $k \to E$, is achieved in the minimum time, $t(c_i) \to \min$.
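The criterion above can be sketched as a simple ranking. The following is a minimal illustration, assuming each combination $c_i$ has already been simulated to obtain its number of disconnected branches and its shutdown time. The `AttackOutcome` type, the lexicographic tie-breaking rule (more disconnections first, then less time), and all numeric values are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackOutcome:
    combination: tuple    # branches switched off by the attacker (c_i)
    disconnected: int     # branches out of service afterwards (k)
    shutdown_time: float  # time to turn off the system, t(c_i)


def strongest(outcomes):
    """Pick the outcome with the most disconnections; break ties by least time.

    The lexicographic ordering is an assumption made for this sketch.
    """
    return max(outcomes, key=lambda o: (o.disconnected, -o.shutdown_time))


# Hypothetical simulation results, for illustration only.
outcomes = [
    AttackOutcome(("3-4",), 2, 40.0),
    AttackOutcome(("3-4", "3-10"), 8, 25.0),
    AttackOutcome(("3-4", "3-5", "3-10"), 8, 30.0),
]

best = strongest(outcomes)
print(best.combination)
```

Other trade-offs between $k$ and $t(c_i)$ (for example a weighted score) would fit the same structure by changing only the `key` function.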
For further consideration, the following assumptions are made:
• Assumption 1: Any type of attack can be used in a simultaneous attack.
• Assumption 2: All an attacker needs is to manipulate a relay or switch. A line switching attack can, for example, be triggered by initiating an emergency protection scheme or a corrective action scheme.
• Assumption 3: Attackers have the resources to attack multiple lines at the same time to initiate a simultaneous attack.
• Assumption 4: For N-k contingencies, k is bounded by a maximum value $k_{max}$; that is, at most $k_{max}$ branches are assumed to trip at once in a simultaneous attack.
• Assumption 5: The strongest attack combination is determined by the minimum time required to achieve a power outage, with the outage measured as the percentage of power (MW) lost.
In this case, the percentage of lost MW is calculated using the formula below:
$$\text{MW lost (\%)} = \frac{a - b}{a} \times 100,$$
where a is the total number of MW before the attack and b is the total number of MW after the attack.
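As a small worked illustration, the sketch below reads the percentage of lost MW as the relative drop from the pre-attack total a to the post-attack total b. The function name and the sample values are hypothetical.

```python
def percent_mw_lost(mw_before: float, mw_after: float) -> float:
    """Percentage of supplied power lost, relative to the pre-attack total."""
    if mw_before <= 0:
        raise ValueError("pre-attack MW total must be positive")
    return 100.0 * (mw_before - mw_after) / mw_before


# Hypothetical totals: 250 MW served before the attack, 200 MW after it.
loss = percent_mw_lost(250.0, 200.0)
print(f"{loss:.1f}% of MW lost")
```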
The problem of finding weaknesses in the configuration of a distributed electricity network is formulated as a link prediction problem in its cyber graph.
For an undirected graph $G(V, E)$, where $V$ is the set of nodes $i$ and $E$ is the set of edges $e(i, j)$ connecting nodes $i$ and $j$, it is necessary to determine a distance function between the nodes of the graph that makes it possible, given the structure of the graph $G(t_0, t_0^*)$ in the time interval $(t_0, t_0^*)$, to predict the structure $G(t_1, t_1^*)$ in the interval $(t_1, t_1^*)$.
Because smart grids have a dynamic network structure, the PageRank importance indicator, which is commonly used for social network prediction tasks [23], on the Internet, and for cyber threat detection through link analysis [24], can be used to assess the impact of events on the cyberinfrastructure of distributed power grids. The reasons for using the PageRank algorithm to calculate node criticality are:
• The results are calculated using a stochastic approach that reflects the randomness in the evolution of the model. With regard to cyber-threat infrastructures, we assume that smart grid cyberspace evolves constantly through the appearance of new organizations, owners, IPs, servers, malware samples, domains, and registrars. This emergence of new vertices influences the evolution of estimates of how accessible the network is to cybercriminals.
• The random model represents access to the nodes of the graph with a probability $d$ (the damping factor).
As with the changing cyber-threat infrastructure, a probabilistic model is of interest because it makes it possible to track potential actions taken through infected machines. For example, it is assumed that a compromised domain can be visited by an infected machine, an IP address can be connected to infected machines or to the server of a compromised domain, an FTP server can be used to download stolen information, an SMTP server can be used to start spam or phishing campaigns, and an IRC channel can be used to instruct bots to launch DDoS attacks, distribute malware, or carry out other malicious activity.

IV. TECHNIQUE FOR ASSESSING THE IMPACT OF EVENTS ON SMART GRID CYBER INFRASTRUCTURE
Given the scale and constantly changing structure of smart grid networks, it is additionally assumed that attackers are able to attack any line by infecting the target node. Therefore, both the attacker and the defender are taken to have the same goal: to find the vertices that have the greatest impact on the infrastructure. Decisions about the strongest combinations of attacks, those capable of causing maximum damage to the system, are made by calculating the importance of the nodes of the graph and reordering the nodes according to the values obtained.
The analysis can be based on cyber graph models or on a topological network diagram, which is likewise represented as a graph.
Step 1. Create a matrix of branches.
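The adjacency matrix of graph $G$ with a finite number of nodes $n$ is a square matrix of size $n \times n$ whose elements are equal to the weight $\omega_{ij}$ of the edge $e(i, j)$:
$$A = [a_{ij}]_{n \times n}, \qquad a_{ij} = \begin{cases} \omega_{ij}, & e(i, j) \in E, \\ 0, & \text{otherwise.} \end{cases}$$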
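Step 1 can be illustrated with a short sketch that builds a weighted adjacency matrix from an edge list. It assumes an undirected graph with nodes numbered from 1 to n; the function name, the three-node example, and its weights are hypothetical.

```python
def adjacency_matrix(n, weighted_edges):
    """Build an n x n weighted adjacency matrix for an undirected graph.

    weighted_edges holds (i, j, weight) triples with 1-based node numbers.
    Entries without an edge stay at 0.
    """
    matrix = [[0.0] * n for _ in range(n)]
    for i, j, weight in weighted_edges:
        matrix[i - 1][j - 1] = weight
        matrix[j - 1][i - 1] = weight  # undirected: mirror the entry
    return matrix


# Hypothetical three-node network, for illustration only.
edges = [(1, 2, 1.0), (2, 3, 0.5), (1, 3, 2.0)]
A = adjacency_matrix(3, edges)
for row in A:
    print(row)
```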
Step 2. Generate a combinational vector of attacks.
For $k$ disconnections with $E$ edges, the number of possible combinations is calculated by the following formula [25]:
$$C_E^k = \frac{E!}{k!\,(E - k)!}$$
Step 3. Create a combination matrix of transitions.
A stochastic matrix is created for each attack vector, in which every column consists of real numbers from 0 to 1 that sum to 1.
The importance $PR(v_i)$ of a node $v_i$ linked from nodes $v_j$, where $d_o(v_j)$ is the number of output edges from node $v_j$, is calculated in the standard PageRank form [26]:
$$PR(v_i) = \frac{1 - d}{n} + d \sum_{v_j \to v_i} \frac{PR(v_j)}{d_o(v_j)},$$
where $d$ is the damping factor (0.85) and $n$ is the number of nodes.
In this case, it is assumed that if the attacker has reached node $v_j$ with probability $d$, then the probability of moving on to another node $v_i$ is $1/d_o(v_j)$.
As a result, one step yields one equation per node, with the corresponding number of unknown values $PR(v_i)$.
The resulting model reflects access to elements of the cyberinfrastructure and can be used to analyze potential attacks through infected channels.
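The transition matrix and the importance calculation can be sketched as a power iteration. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the common normalised PageRank form with a $(1 - d)/n$ teleport term, treats nodes without output edges as linking to every node, and stops when the L1 change between iterations drops below `eps`. The function names and the four-node star graph are hypothetical.

```python
def column_stochastic(adj):
    """Normalise each column of a 0/1 adjacency matrix so that it sums to 1.

    adj[i][j] = 1 means there is an edge from node j to node i.
    Columns of nodes with no output edges are filled uniformly (assumption).
    """
    n = len(adj)
    out_deg = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    return [
        [adj[i][j] / out_deg[j] if out_deg[j] else 1.0 / n for j in range(n)]
        for i in range(n)
    ]


def pagerank(adj, d=0.85, eps=1e-8, max_iter=1000):
    """Iterate PR values until the L1 change between steps is below eps."""
    n = len(adj)
    m = column_stochastic(adj)
    pr = [1.0 / n] * n
    for _ in range(max_iter):
        new = [
            (1 - d) / n + d * sum(m[i][j] * pr[j] for j in range(n))
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(new, pr)) < eps:
            return new
        pr = new
    return pr


# Hypothetical star network: node 0 is connected to nodes 1, 2 and 3.
star = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
ranks = pagerank(star)
order = sorted(range(len(ranks)), key=lambda i: ranks[i], reverse=True)
print(order[0], [round(r, 4) for r in ranks])
```

Sorting the nodes by the resulting values, as in the last lines, gives the order in which they would be examined for criticality.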
Step 6. Analyze the impact of cyberattack combinations on the smart grid infrastructure.
The analysis calculates the possible damage and identifies the strongest combinations of attacks.

V. CASE STUDY
The proposed approach was tested by analyzing the distribution of pandemic malware in a smart grid network with an arbitrary structure. To evaluate the capabilities and potential of the proposed method, we consider a hypothetical network diagram (Fig. 1).

Figure 1. An example of a network diagram
The network structure has 10 buses, 6 transformers and 14 transmission lines, which are given unique numbers corresponding to the connection points of the respective buses i-j. Under the accepted assumptions, this scheme can be represented as a graph (Fig. 2), for which the corresponding adjacency and incidence matrices are formed. Then, for the graph shown in Fig. 2, the reachability set of each node and the sets for input streams can be written out. As can be seen from these sets, reachability alone does not make it possible to decide on the importance of the vertices in terms of their availability for the pandemic distribution of malicious software.
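Reachability sets of this kind can be computed with a breadth-first search. The sketch below is illustrative only: it does not use the graph of Fig. 2, the four-node directed graph is hypothetical, and excluding the start node from its own set is a convention chosen here.

```python
from collections import deque


def reachable_set(adj_list, start):
    """Return all nodes reachable from start (start itself is excluded)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj_list.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    seen.discard(start)
    return seen


def reverse(adj_list):
    """Flip every edge, so reachability on the result gives the input sets."""
    rev = {node: [] for node in adj_list}
    for src, targets in adj_list.items():
        for dst in targets:
            rev.setdefault(dst, []).append(src)
    return rev


# Hypothetical directed graph, for illustration only.
graph = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(reachable_set(graph, 1))           # nodes that node 1 can reach
print(reachable_set(reverse(graph), 4))  # nodes that can reach node 4
```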
For the scheme in Fig. 1, the second step gives 24 combinations of attacks for a single shutdown. Considering E-2 contingencies (that is, k = 2), the number of possible combinations of simultaneous attacks will be [27]:
$$C_{24}^{2} = \frac{24!}{2!\,(24 - 2)!} = 276,$$
which illustrates how much damage well-planned attacks can do to power systems.
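The combination count can be checked numerically. The sketch below assumes that the count is the binomial coefficient and that E = 24, read from the 24 single-shutdown combinations stated above; the branch labels are hypothetical placeholders. `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations
from math import comb

E = 24  # number of branches, inferred from the 24 single-shutdown cases

print(comb(E, 1))  # combinations for a single shutdown (k = 1)
print(comb(E, 2))  # combinations for two simultaneous shutdowns (k = 2)

# Enumerate the k = 2 attack vectors explicitly (placeholder branch labels).
branches = [f"b{i}" for i in range(1, E + 1)]
pairs = list(combinations(branches, 2))
print(len(pairs), pairs[:3])
```

The count grows quickly with k, which is why the enumeration is usually bounded by a maximum number of simultaneous disconnections.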
The next step of the proposed method is to create the combinational transition matrices, which are formed from the incidence matrices obtained in the previous step. As a result, we get the following distributions of transition values. The next step is to compute the importance of the nodes, which is determined by the number of output edges, taking into account their connectivity. This task is iterative and calls for granulation to achieve maximum processing speed at the maximum degree of systematic abstraction [28].

Input transition matrix
According to [29], the granular processing principle involves solving the problem for a single node and then extrapolating the result to the nodes of the entire graph. The calculation algorithm thus consists of six stages [30].

In Stage 2, the number of edges entering node $v_i$ is $\sum_j e_{i,j}$, where $e_{i,j} = 1$ if node $v_i$ is the end of an arc joining nodes $v_i$ and $v_j$, and $e_{i,j} = 0$ otherwise.
Stage 3. For each source of input edges, the total number of output edges $d_o(v_j)$ is calculated:
$$d_o(v_j) = \sum_i e_{i,j},$$
where $e_{i,j} = 1$ if node $v_j$ is the beginning of an arc connecting nodes $v_i$ and $v_j$, and $e_{i,j} = 0$ otherwise.
In the convergence check, $\varepsilon$ is the error limit, an arbitrary value from 0 to 1. The smaller the error limit, the more accurate the result.

VI. DISCUSSION
As a result of the redistribution, it becomes possible to determine the order in which nodes are checked according to their importance / criticality for the system as a whole. Another way to apply the proposed approach, and its natural evolution, is to evaluate the strongest combinations of network attacks. In this context, relatively recent approaches [25,31] can be used, which enable modeling attacks on individual branches of the graph and assessing the potential harm from their implementation.
As a rule, the electrical topological structure of the energy system is static (updated only when new elements are included) and contains detailed information about system assets, their characteristics and configurations. The only dynamically changing parameter is the state of the network switching equipment. In a power system, the status of network switches changes only when the system is reconfigured, which does not happen often, whereas the modules of the power management system perform their operations very frequently, even every few minutes, depending on the program. With this in mind, the most attractive goal for a cyberattack is to disable branches or switch them from the working state to the off state. This, in turn, can cause an overload and a series of successive (cascading) system shutdowns. Thus, after identifying the most important vertices and fixing the types of attacks to which the system may be subjected, it is necessary to measure the changes in the system's power supply parameters (load, percentage of lost MW, etc.). Then, from the data obtained, the damage level can be calculated and the strongest combinations of attacks determined. Fig. 3 illustrates a situation where simultaneously switching off lines {3-4, 3-5, 3-10} causes a cascade shutdown of lines {3-5, 10-7, 10-6, 4-8, 4-9} (shown by a dashed line). A fragment of data with variants of cascading disconnections for the topological scheme shown in Fig. 1 is presented in Table VI.
As can be seen from Table VI, one of the key points of this analysis is that, from the attacker's point of view, it is enough to overload several lines to cause maximum damage. For example, to cause a cascade shutdown of branches 3-1, 3-5, 10-6, 10-7, 4-8, 4-9, it is enough to turn off the two lines 3-4 and 3-10 instead of the set 3-4, 3-5, 3-10.
A more detailed analysis using flow directions gives another interesting observation: a line overloaded during a particular line cutoff (N-1) may not be overloaded for a combination of previous line cutoffs. Nevertheless, N-2 outages are clearly more harmful to the system than N-1 outages when their overall impact is considered.
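The comparison of attack sets can be sketched as a lookup over cascade records. The two records below are the examples quoted in the text. The selection rule (largest total outage first, then fewest attacked lines) is an assumption of this sketch, as are the function names.

```python
def total_outage(attacked, cascaded):
    """All lines out of service: attacked lines plus cascaded ones."""
    return set(attacked) | set(cascaded)


def strongest_combination(records):
    """Prefer the largest total outage, then the fewest attacked lines."""
    return max(
        records,
        key=lambda attacked: (
            len(total_outage(attacked, records[attacked])),
            -len(attacked),
        ),
    )


# Attacked lines -> lines lost in the resulting cascade (from the text).
records = {
    ("3-4", "3-5", "3-10"): {"3-5", "10-7", "10-6", "4-8", "4-9"},
    ("3-4", "3-10"): {"3-1", "3-5", "10-6", "10-7", "4-8", "4-9"},
}

best = strongest_combination(records)
print(best, len(total_outage(best, records[best])))
```

With these two records, the two-line attack is selected because it takes more lines out of service while requiring fewer switching actions.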

VII. CONCLUSIONS
The results of testing the method show that a key point of the proposed approach is that, from the attacker's point of view, it is enough to overload several lines to cause maximum damage to the system. A more detailed analysis using flow directions provided another observation: a line overloaded during a particular line cutoff (N-1) may not be overloaded for a combination of previous line cutoffs. However, it has been confirmed that N-2 outages are more harmful to the system than N-1 outages when their overall impact is considered.
The simulation results allow us to conclude that the proposed structures can be used to find and identify the strongest combinations of attacks capable of causing maximum damage to the system. To solve higher-level decision-making problems, such as obtaining realistic data on changes to system characteristics in the presence of cyber interference, it is useful to use another type of model that directly describes the physical processes in the system.

Step 4. Calculate the importance of the nodes.
The calculation iteratively finds the values $PR(v_i, t)$ for each $t$ until they converge, i.e., $|PR(v_i, t) - PR(v_i, t-1)|_1 < \varepsilon$, where $\varepsilon$ is the permissible error.

Step 5. Redistribute the nodes of the graph by their criticality for the invasion.
At this stage, the nodes of the graph are sorted by PR, and decisions are made on what to do next.

Stage 1. Select a random node $v_i$.
Stage 2. Count all edges entering this node.

Stage 4. Compute the importance $PR(v_i)$ of the node linked from node $v_j$, where $d_o(v_j)$ is the number of output edges from node $v_j$.
Stage 5. Repeat Stages 1-3 for all nodes.
Stage 6. Check the convergence of the results. The calculations are complete when the convergence condition is reached; otherwise, Stages 1-5 are repeated. Convergence is considered to occur when all ranks are within the error boundary $\varepsilon$.
TABLE VI. DATA SNIPPET WITH CASCADE SHUTDOWN OPTIONS